Lexical Transfer Using a Vector-Space Model
Abstract
Building a bilingual dictionary for transfer in a machine translation system is conventionally done by hand and is very time-consuming. To overcome this bottleneck, we propose a new mechanism for lexical transfer that is simple and well suited to learning from bilingual corpora. It exploits a vector-space model developed in information retrieval research. We present a preliminary result from our computational experiment.
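The abstract does not spell out the mechanism, so the following is only a minimal sketch of one way a vector-space model from information retrieval could drive lexical transfer. It assumes that each candidate target word is indexed by a bag-of-words vector of the source contexts it was aligned with, and that a new source context is matched to the closest target vector by cosine similarity. The toy English-French data and the names `build_index`, `transfer`, and `cosine` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v.get(term, 0) for term, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_index(aligned_pairs):
    """Accumulate, for each target word, the source-context words seen with it.

    aligned_pairs: iterable of (source_context_words, target_word) tuples,
    standing in for examples extracted from a bilingual corpus (assumption).
    """
    index = {}
    for context, target in aligned_pairs:
        index.setdefault(target, Counter()).update(context)
    return index


def transfer(context, index):
    """Return the target word whose context vector is closest to the input."""
    query = Counter(context)
    return max(index, key=lambda target: cosine(query, index[target]))


# Hypothetical toy corpus: English contexts of "bank" aligned with French words.
aligned_pairs = [
    (["bank", "money", "loan"], "banque"),
    (["bank", "account", "money"], "banque"),
    (["bank", "river", "water"], "rive"),
    (["bank", "river", "fishing"], "rive"),
]
index = build_index(aligned_pairs)

river_choice = transfer(["bank", "river", "boat"], index)
money_choice = transfer(["bank", "loan", "money"], index)
```

In this sketch the "dictionary" is never written by hand: adding more aligned examples only updates the count vectors, which is the sense in which such a scheme would be learnable from bilingual corpora.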
Similar resources
A Distributed Representation-Based Framework for Cross-Lingual Transfer Parsing
This paper investigates the problem of cross-lingual transfer parsing, aiming at inducing dependency parsers for low-resource languages while using only training data from a resource-rich language (e.g., English). Existing model transfer approaches typically don’t include lexical features, which are not transferable across languages. In this paper, we bridge the lexical feature gap by using dis...
A good space: Lexical predictors in vector space evaluation
Vector space models benefit from using an outside corpus to train the model. It is, however, unclear what constitutes a good training corpus. We have investigated the effect on summary quality when using various language resources to train a vector space based extraction summarizer. This is done by evaluating the performance of the summarizer utilizing vector spaces built from corpora from diff...
Building a Bilingual Representation of the Roget Thesaurus for French to English Machine Translation
This paper describes a solution to lexical transfer as a trade-off between a dictionary and an ontology. It shows its association to a translation tool based on morpho-syntactical parsing of the source language. It is based on the English Roget Thesaurus and its equivalent, the French Larousse Thesaurus, in a computational framework. Both thesauri are transformed into vector spaces, and all mo...
Multilingual Training of Crosslingual Word Embeddings
Crosslingual word embeddings represent lexical items from different languages using the same vector space, enabling crosslingual transfer. Most prior work constructs embeddings for a pair of languages, with English on one side. We investigate methods for building high quality crosslingual word embeddings for many languages in a unified vector space. In this way, we can exploit and combine infor...
Classification of transformer faults using frequency response analysis based on cross-correlation technique and support vector machine
One of the most important methods for transformer fault diagnosis (especially mechanical defects) is the frequency response analysis (FRA) method. The most important step in the FRA diagnostic process is to differentiate the faults and classify them into different classes. This paper uses the intelligent support vector machine (SVM) method to classify transformer faults. For this purpose, two gr...